A framework for dialogue data collection with a simulated ASR channel
Authors
Abstract
The application of machine learning methods to the dialogue management component of spoken dialogue systems is a growing research area. Whereas traditional methods use handcrafted rules to specify a dialogue policy, machine learning techniques seek to learn dialogue behaviours from a corpus of training data. In this paper, we identify the properties of a corpus suitable for training machine-learning techniques, and propose a framework for collecting dialogue data. The approach is akin to a “Wizard of Oz” set-up with a “wizard” and a “user”, but introduces several novel variations to simulate the automatic speech recognition (ASR) communication channel. Specifically, a turn-taking model common in spoken dialogue systems is used, and rather than hearing the user directly, the wizard sees simulated speech recognition results on a screen. The simulated recognition results are produced with an error-generation algorithm which allows the target word error rate (WER) to be adjusted. An evaluation of the algorithm is presented.
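To make the idea of an adjustable target WER concrete, the following is a minimal sketch and not the paper's error-generation algorithm, whose details the abstract does not give. It assumes a deliberately simple scheme in which each word is independently hit by a random substitution, deletion, or insertion with probability equal to the target WER. The function names `wer` and `corrupt`, and the confusion vocabulary passed in, are hypothetical.

```python
import random


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the reference prefix so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)


def corrupt(utterance, target_wer, vocabulary, rng):
    """Hypothetical error model: one random error per word with probability target_wer.

    Assumes `vocabulary` holds at least two distinct words.
    """
    out = []
    for word in utterance.split():
        if rng.random() >= target_wer:
            out.append(word)                 # word passes through unchanged
            continue
        kind = rng.choice(["sub", "del", "ins"])
        if kind == "sub":
            out.append(rng.choice([w for w in vocabulary if w != word]))
        elif kind == "ins":
            out.extend([word, rng.choice(vocabulary)])
        # "del": the word is dropped
    return " ".join(out)


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["paris", "london", "monday", "ticket", "two"]
    said = "i would like a flight to boston on friday"
    seen = corrupt(said, 0.3, vocab, rng)
    print(seen, wer(said, seen))
```

Under this scheme the realised WER of a single utterance varies, and adjacent errors can partly cancel, so the target is only approached on average over many utterances. A real simulator would more plausibly draw confusions from phonetically similar words rather than a uniform vocabulary.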
Related resources
On-Line Learning of a Persian Spoken Dialogue System Using Real Training Data
The first spoken dialogue system developed for the Persian language is introduced. This is a ticket reservation system with Persian ASR and NLU modules. The focus of the paper is on learning the dialogue management module. In this work, real on-line training data are used during the learning process. For on-line learning, the effect of the variations of discount factor (g) on the learning speed...
Characterizing Task-Oriented Dialog using a Simulated ASR Channel
We describe a data collection consisting of task-oriented human-human conversations in a simulated ASR channel in which the WER is systematically varied. We find that users infrequently give a direct indication of having been misunderstood; levels of expert “initiative” increase with WER primarily due to increased grounding activity; and asking task-related questions appears to be a more succes...
Parametric study of a viscoelastic RANS turbulence model in the fully developed channel flow
One of the newest viscoelastic RANS turbulence models for drag-reducing channel flow with polymer additives is studied under different flow and rheological properties. In this model, the finitely extensible nonlinear elastic-Peterlin (FENE-P) constitutive model is used to describe the viscoelastic effect of the polymer solution, and the turbulence model is developed in the $k$-$\epsilon$-$\overline{v^2}$-$f$ framework. The geome...